Classification of Health Webpages as Expert and Non Expert with a Reduced Set of Cross-language Features

Authors

  • Natalia Grabar
  • Sonia Krivine
  • Marie-Christine Jaulent
Abstract

Distinguishing expert from non-expert health documents can help users select the information best suited to them, depending on whether or not they are familiar with medical terminology. This issue is particularly important for information retrieval. We address it through stylistic corpus analysis and the application of machine learning algorithms. Our hypothesis is that the distinction can be made on the basis of a small number of features, and that such features can be language- and domain-independent. The features were acquired from a source corpus (Russian language, diabetes topic) and then tested on both the target corpus (French language, pneumology topic) and the source corpus. These cross-language features yield 90% precision and 93% recall for non-expert documents in the source language, and 85% precision and 74% recall for expert documents in the target language.
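As a reading aid for the per-class figures above, the following minimal sketch shows how precision and recall are conventionally computed for one class of a two-class document classifier. The function name `precision_recall`, the label strings, and the toy label lists are assumptions made for illustration; they are not the authors' code or data.

```python
def precision_recall(gold, predicted, target):
    """Return (precision, recall) for the class label `target`."""
    pairs = list(zip(gold, predicted))
    # True positives: documents of the target class that were labeled as such.
    tp = sum(1 for g, p in pairs if g == target and p == target)
    # False positives: other documents wrongly labeled as the target class.
    fp = sum(1 for g, p in pairs if g != target and p == target)
    # False negatives: target-class documents that were given another label.
    fn = sum(1 for g, p in pairs if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, invented for illustration only (not data from the study).
gold = ["expert", "expert", "non_expert", "non_expert", "non_expert"]
predicted = ["expert", "non_expert", "non_expert", "non_expert", "expert"]

for label in ("expert", "non_expert"):
    p, r = precision_recall(gold, predicted, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
```

Reporting the two measures per class, as the abstract does, matters because a classifier can be precise on one class while missing many documents of the other.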

Journal title:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium


Publication date: 2007